Abstract

Compressed sensing is a new methodology for constructing sensors which allow sparse signals to be
efficiently recovered using only a small number of observations. The recovery problem can often be stated
as that of finding the solution of an underdetermined system of linear equations with the smallest possible
support. The most studied relaxation of this hard combinatorial problem is the $\ell_1$-relaxation, which consists of
searching for solutions with smallest $\ell_1$-norm. In this short note, based on the ideas of Lagrangian duality,
we introduce an alternating $\ell_1$ relaxation for the recovery problem enjoying higher recovery rates in practice
than the plain $\ell_1$ relaxation and the recent reweighted $\ell_1$ method of Candès, Wakin and Boyd.

1 Introduction 

Compressed Sensing (CS) is a very recent and fast-growing field whose impact on concrete applications
in coding and image acquisition is already remarkable. Up-to-date information on this new topic may be
obtained from the website http://nuit-blanche.blogspot.com/. The foundational paper is [1], where the main
problem considered was that of reconstructing a signal from a few frequency measurements. Since then,
important contributions to the field have appeared; see [5] for a survey and references therein.

1.1 The Compressed Sensing problem 

In mathematical terms, the problem can be stated as follows. Let $x$ be a $k$-sparse vector in $\mathbb{R}^n$, i.e. a vector
with no more than $k$ nonzero components. The observations are simply given by

$$y = Ax \qquad (1.1.1)$$

where $A \in \mathbb{R}^{m \times n}$, $m$ is small compared to $n$ and $\operatorname{rank} A = m$, and the goal is to recover $x$ exactly from these
observations. The main challenges concern the construction of observation matrices $A$ which allow $x$ to be
recovered with $k$ as large as possible for given values of $n$ and $m$.

The problem of compressed sensing can be solved unambiguously if there is no sparser solution to the linear
system (1.1.1) than $x$. Then, recovery is obtained by simply finding the sparsest solution to (1.1.1). If for any
$x$ in $\mathbb{R}^n$ we denote by $\|x\|_0$ the $\ell_0$-norm of $x$, i.e. the cardinality of the set of indices of nonzero components of $x$,
the compressed sensing problem is equivalent to

$$\min_{x \in \mathbb{R}^n} \|x\|_0 \quad \text{s.t.} \quad Ax = y. \qquad (1.1.2)$$

We denote by $\Delta_0(y)$ the solution of problem (1.1.2), and $\Delta_0$ is called a decoder.$^1$ Thus, the CS problem
may be viewed as a combinatorial optimization problem. Moreover, the following lemma is well known.

Lemma 1.1.1 (See for instance [2]) If $A$ is any $m \times n$ matrix and $2k \le m$, then the following properties are
equivalent:

i. The decoder $\Delta_0$ satisfies $\Delta_0(Ax) = x$ for all $x \in \Sigma_k$, where $\Sigma_k$ denotes the set of $k$-sparse vectors in $\mathbb{R}^n$,

ii. For any set of indices $T$ with $\#T = 2k$, the matrix $A_T$ has rank $2k$, where $A_T$ stands for the submatrix
of $A$ composed of the columns indexed by $T$ only.

$^1$ In the general case where $x$ is not the unique sparsest solution of (1.1.2), using this approach for recovery is of course possibly
not pertinent. Moreover, in such a case, this problem has several solutions with equal $\ell_0$-"norm" and one may rather define $\Delta_0(y)$
as an arbitrary element of the solution set.
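For very small instances, both property ii of Lemma 1.1.1 and the decoder $\Delta_0$ can be checked by exhaustive enumeration. The Python sketch below is only an illustration under stated assumptions: it relies on NumPy, the function names `satisfies_rank_condition` and `l0_decoder` and the tolerance `tol` are illustrative choices that do not come from the paper, and the running time grows combinatorially with $n$, in line with the combinatorial nature of (1.1.2).

```python
from itertools import combinations

import numpy as np


def satisfies_rank_condition(A, k):
    """Property ii of Lemma 1.1.1: every set of 2k columns of A has rank 2k."""
    n = A.shape[1]
    return all(
        np.linalg.matrix_rank(A[:, list(T)]) == 2 * k
        for T in combinations(range(n), 2 * k)
    )


def l0_decoder(A, y, tol=1e-9):
    """Brute-force solution of (1.1.2): try supports of increasing size."""
    n = A.shape[1]
    for size in range(n + 1):
        for T in combinations(range(n), size):
            x = np.zeros(n)
            if size > 0:
                cols = list(T)
                # Least-squares fit of y on the candidate support.
                x[cols] = np.linalg.lstsq(A[:, cols], y, rcond=None)[0]
            if np.linalg.norm(A @ x - y) <= tol:
                return x  # first hit has the smallest support size
    return None


A = np.array([[1.0, 0.0, 1.0, 1.0], [0.0, 1.0, 1.0, 2.0]])
x_true = np.array([0.0, 0.0, 3.0, 0.0])  # 1-sparse, and 2k = 2 <= m = 2
print(satisfies_rank_condition(A, 1), l0_decoder(A, A @ x_true))
```

In this toy example any two columns of `A` are linearly independent, so the lemma guarantees that every 1-sparse vector is recovered from its two observations.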



1.2 The $\ell_1$ relaxation



The main problem in using the decoder $\Delta_0(y)$ for given observations $y$ is that the optimization problem (1.1.2)
is NP-hard and cannot reasonably be expected to be solved in polynomial time. In order to overcome this
difficulty, the original decoder $\Delta_0(y)$ has to be replaced by computationally simpler ones.
Assuming that $A$ is given, two methods have been studied for solving the compressed sensing problem. The
first one is orthogonal matching pursuit (OMP) and the second one is the $\ell_1$-relaxation. The two methods are
not directly comparable, since OMP is a greedy algorithm with sublinear complexity whereas the $\ell_1$-relaxation usually
offers better recovery performance at the price of a computational complexity equivalent to that of
linear programming. More precisely, the $\ell_1$ relaxation is given by

$$\min_{x \in \mathbb{R}^n} \|x\|_1 \quad \text{s.t.} \quad Ax = y. \qquad (1.2.1)$$

In the following, we will denote by $\Delta_1(y)$ the solution of the $\ell_1$-relaxation (1.2.1). From the computational
viewpoint, this relaxation is of great interest since it can be solved in polynomial time. Indeed, (1.2.1) is
equivalent to the linear program

$$\min_{x,\, z \in \mathbb{R}^n} \sum_{i=1}^n z_i \quad \text{s.t.} \quad -z \le x \le z \ \text{ and } \ Ax = y.$$
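The linear program above can be handed to any LP solver. The following sketch assumes SciPy's `linprog` with the HiGHS backend; the function name `l1_decoder` and the stacking of the unknowns as `[x, z]` are illustrative choices, not something prescribed by the paper.

```python
import numpy as np
from scipy.optimize import linprog


def l1_decoder(A, y):
    """Solve min sum(z) s.t. -z <= x <= z, Ax = y, and return x."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])  # cost only on z
    I = np.eye(n)
    # x - z <= 0 and -x - z <= 0 encode |x_i| <= z_i.
    A_ub = np.block([[I, -I], [-I, -I]])
    b_ub = np.zeros(2 * n)
    A_eq = np.hstack([A, np.zeros((m, n))])
    # x is free, z is nonnegative (linprog defaults to x >= 0 otherwise).
    bounds = [(None, None)] * n + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]


# One equation, two unknowns: x1 + 2*x2 = 2.
print(l1_decoder(np.array([[1.0, 2.0]]), np.array([2.0])))
```

In this one-equation example the minimizer puts all the mass on the second coordinate, whose column has the larger entry, so the $\ell_1$ solution is the 1-sparse vector $(0, 1)$.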

The main subsequent problem induced by this choice of relaxation is to obtain easy-to-verify sufficient conditions
on $A$ for the relaxation to be exact, i.e. to produce the sparsest solution to the underdetermined system (1.1.1).
A nice condition was given by Candes, Romberg and Tao pQ and is called the Restricted Isometry Property. Up 
to now, this condition could only be proved to hold with high probability in the case where $A$ is a subgaussian
random matrix. Several algorithmic approaches have also been recently proposed in order to guarantee the
exactness of the $\ell_1$ relaxation, such as in [5] and [7]. The goal of our paper is different. Its aim is to present
a new method for solving the CS problem which generalizes the original $\ell_1$-relaxation of [1] and has much better
performance in practice, as measured by the success rate of recovery versus the original sparsity $k$.



2 Lagrangian duality and relaxations 

2.1 Equivalent formulations of the recovery problem 

Recall that the problem of exact reconstruction of sparse signals can be solved using $\Delta_0$ and Lemma 1.1.1. Let
us start by writing down problem (1.1.2), to which $\Delta_0$ is the solution map, as the following equivalent problem

$$\max_{z,\, x \in \mathbb{R}^n} e^t z \qquad (2.1.1)$$

subject to

$$z_i x_i = 0, \quad z_i (z_i - 1) = 0, \quad i = 1, \dots, n, \quad \text{and} \quad Ax = y,$$

where $e$ denotes the vector of all ones. Here, since the sum of the $z_i$'s is maximized, the variable $z$ plays the
role of an indicator function for the event that $x_i = 0$. This problem is clearly nonconvex due to the quadratic
equality constraints $z_i x_i = 0$, $i = 1, \dots, n$.
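The role of $z$ as an indicator can be made concrete on a small vector. The plain-Python sketch below uses illustrative helper names (`zero_indicator`, `is_feasible`, `l0_norm`) and ignores the constraint $Ax = y$; it shows that, for a fixed $x$, the best feasible $z$ marks the zero entries of $x$, so that $e^t z = n - \|x\|_0$ and maximizing $e^t z$ amounts to minimizing $\|x\|_0$.

```python
def zero_indicator(x):
    """For fixed x, the maximizer z of (2.1.1): z_i = 1 exactly when x_i = 0."""
    return [1 if xi == 0 else 0 for xi in x]


def is_feasible(z, x):
    """Check z_i * x_i = 0 and z_i * (z_i - 1) = 0 (Ax = y is not checked)."""
    return all(zi * xi == 0 and zi * (zi - 1) == 0 for zi, xi in zip(z, x))


def l0_norm(x):
    """Number of nonzero components of x."""
    return sum(1 for xi in x if xi != 0)


x = [0.0, 2.5, 0.0, -1.0]
z = zero_indicator(x)
print(z, is_feasible(z, x), sum(z) == len(x) - l0_norm(x))
```

Setting any further entry of `z` to one, for instance `[1, 1, 1, 0]`, violates $z_i x_i = 0$, which is why the objective cannot exceed $n - \|x\|_0$.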

2.2 The standard Semi-Definite Programming (SDP) relaxation scheme 

A simple way to construct an SDP relaxation is to homogenize the quadratic forms in the formulation at hand
using a binary variable $z_0$. Indeed, by symmetry, it will suffice to impose $z_0^2 = 1$ since, if the relaxation
turns out to be exact and a solution $(z_0, z, x)$ is recovered with $z_0 = -1$, then, as the reader will be able to
check at the end of this section, $(-z_0, -z, -x)$ will also solve the relaxed problem. For instance, problem (2.1.1)
can be expressed as

$$\max\ e^t z\, z_0 \qquad (2.2.1)$$

subject to

$$z_i x_i = 0, \quad z_i (z_i - z_0) = 0 \quad \text{and} \quad z_0 A x = y$$

for $i = 1, \dots, n$, and $z_0^2 = 1$.
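Each term of (2.2.1) is a quadratic form in the stacked vector $w = (z_0, z, x) \in \mathbb{R}^{2n+1}$ and can therefore be written as $w^t M w$ for a symmetric matrix $M$. The NumPy sketch below builds one possible family of such matrices and checks them on a test vector. The explicit entries are an assumption of this sketch, derived only from the constraints as stated; any symmetric matrices representing the same quadratic forms would serve equally well, and the function name `homogenized_matrices` is illustrative.

```python
import numpy as np


def homogenized_matrices(A):
    """Symmetric matrices with w^T M w equal to each term of (2.2.1)."""
    m, n = A.shape
    N = 2 * n + 1  # w = (z0, z_1..z_n, x_1..x_n)

    def sym(M):
        return (M + M.T) / 2  # symmetrizing leaves w^T M w unchanged

    Q = np.zeros((N, N))
    Q[0, 1:n + 1] = 1.0  # z0 * e^T z
    E0 = np.zeros((N, N))
    E0[0, 0] = 1.0  # z0^2
    C, E, Aj = [], [], []
    for i in range(n):
        Ci = np.zeros((N, N))
        Ci[1 + i, n + 1 + i] = 1.0  # z_i * x_i
        C.append(sym(Ci))
        Ei = np.zeros((N, N))
        Ei[1 + i, 1 + i] = 1.0  # z_i^2
        Ei[1 + i, 0] = -1.0  # - z_i * z0
        E.append(sym(Ei))
    for j in range(m):
        M = np.zeros((N, N))
        M[0, n + 1:] = A[j]  # z0 * a_j^T x
        Aj.append(sym(M))
    return sym(Q), E0, C, E, Aj


A = np.array([[1.0, 2.0]])
Q, E0, C, E, Aj = homogenized_matrices(A)
w = np.array([-1.0, 2.0, 3.0, 4.0, 5.0])  # z0=-1, z=(2,3), x=(4,5)
print(w @ Q @ w, w @ Aj[0] @ w, np.trace(Q @ np.outer(w, w)))
```

The last printed value illustrates the identity $w^t M w = \operatorname{trace}(M\, w w^t)$, which is what turns quadratic constraints in $w$ into linear constraints in the matrix variable $X = w w^t$.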



If we choose to keep all the constraints in problem (2.2.1) explicit, the Lagrange function can easily be
written as

$$L_{SDP}(w, \lambda, \mu, \nu) = w^t Q w + \sum_{i=1}^n \lambda_i w^t C_i w + \sum_{i=1}^n \mu_i w^t E_i w + \mu_0 w^t E_0 w + \sum_{j=1}^m \nu_j w^t A_j w - \mu_0 - \nu^t y,$$

where $w$ is the concatenation of $z_0$, $z$, $x$ into one vector, $\lambda$ (resp. $\mu$ and $\nu$) is the vector of Lagrange multipliers
associated to the constraints $z_i x_i = 0$, $i = 1, \dots, n$ (resp. $z_i (z_i - z_0) = 0$, $i = 1, \dots, n$, and $z_0 a_j^t x = y_j$, $j = 1, \dots, m$),
$\mu_0$ is the multiplier associated to the constraint $z_0^2 = 1$,
and where all the matrices $Q$, $A_j$, $j = 1, \dots, m$, $E_i$, $i = 0, \dots, n$, and $C_i$, $i = 1, \dots, n$, belong to $\mathbb{S}_{2n+1}$, the set of
symmetric $(2n+1) \times (2n+1)$ real matrices, and represent the quadratic forms appearing in the objective and
constraints of (2.2.1). Here $a_j^t$ is the $j$th row of $A$ for $j = 1, \dots, m$, $e_i$ is the vector with all components equal
to zero except the $i$th, which is set to one, for $i = 1, \dots, n$, $e$
is the vector of all ones, $D(e_i)$ is the diagonal matrix with diagonal vector $e_i$, and $0_{k,l}$ denotes the $k \times l$
matrix of all zeros. The dual function is given by



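$$\theta_{SDP}(\lambda, \mu, \nu) = \sup_{w} L_{SDP}(w, \lambda, \mu, \nu),$$

and thus

$$\theta_{SDP}(\lambda, \mu, \nu) = \begin{cases} -\mu_0 - \nu^t y & \text{if } Q(\lambda, \mu, \nu) \preceq 0, \\ +\infty & \text{otherwise,} \end{cases}$$

with

$$Q(\lambda, \mu, \nu) = Q + \sum_{i=1}^n \lambda_i C_i + \sum_{i=0}^n \mu_i E_i + \sum_{j=1}^m \nu_j A_j$$

and where $\succeq$ is the Löwner ordering ($A \succeq B$ iff $A - B$ is positive semi-definite). Therefore, the dual problem
is given by

$$\inf_{\lambda, \mu, \nu} \theta_{SDP}(\lambda, \mu, \nu),$$

which is in fact equivalent to the following semi-definite program

$$\inf_{\lambda, \mu, \nu}\ -\mu_0 - \nu^t y \qquad (2.2.2)$$

subject to

$$Q(\lambda, \mu, \nu) \preceq 0. \qquad (2.2.3)$$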



We can also try to formulate the dual of this semi-definite program, which is called the bidual of the initial
problem. After some computations, this bidual problem is easily seen to be given by

$$\max_{X \succeq 0}\ \operatorname{trace}(QX) \qquad (2.2.4)$$

subject to

$$\operatorname{trace}(A_j X) = y_j, \ j = 1, \dots, m, \qquad \operatorname{trace}(E_0 X) = 1, \qquad \operatorname{trace}(E_i X) = 0 \ \text{ and } \ \operatorname{trace}(C_i X) = 0, \ i = 1, \dots, n. \qquad (2.2.5)$$



Now, if $X^*$ is an optimal solution with $\operatorname{rank}(X^*) = 1$, then

$$X^* = w^* {w^*}^t \quad \text{with} \quad w^* = [z_0, {z^*}^t, {x^*}^t]^t,$$

and it can be easily verified that all the constraints in (2.2.1) are satisfied. Moreover, we may additionally
impose that $z_0 = 1$.$^2$ However, the following proposition ruins the hopes for the occurrence of such an agreeable
situation.

Proposition 2.2.1 If nonempty, the solution set of the bidual problem (2.2.4) is not a singleton and it contains
matrices with rank equal to $n - m$.

Proof. Consider the subspace $W_0$ of $\mathbb{R}^{2n+1}$ defined as the set of vectors whose first $n + 1$ coordinates are equal to zero
and such that the last $n$ coordinates form a vector in the kernel of $A$. Since we assumed that $\operatorname{rank} A = m$, we
have $\dim W_0 = n - m$. Assume that there exists a solution $X^*$ to (2.2.4) with rank less than or equal to
$n - m - 1$. Then, it is possible to find a nonzero vector $w$ in $W_0$ with $w \perp P_{W_0}(\operatorname{Range}(X^*))$, where $P_{W_0}$ denotes the
orthogonal projection onto $W_0$. On the other hand, one
can easily check that $X^{**} = X^* + w w^t$ satisfies all the bidual constraints and has the same objective value as
$X^*$. Thus, $X^{**}$ is also a solution of the bidual problem and $\operatorname{rank} X^{**} = \operatorname{rank} X^* + 1$. Iterating the argument
up to matrices of dimension equal to $n - 1$, we obtain that the solution set contains matrices with rank equal
to $n - m$. To prove non-uniqueness of the solution, for any solution matrix $X^*$, set $X^{***} = X^* + w w^t$ for any
choice of $w$ in $W_0$; then $X^{***}$ is also a solution of the bidual problem. □

2.3 Comments on the SDP relaxation 

Despite the powerful Lagrangian methodology behind its construction, the SDP relaxation of the problem has
three major drawbacks:

• as implied by Proposition 2.2.1, the standard SDP relaxation scheme leads to solutions which naturally
have rank greater than one, which makes it hard to recover a good primal candidate. Moreover,
even if the rank problem could be overcome in practice in the case where $x$ is sparse enough by adding
more ad hoc constraints in the SDP, finding the most natural way to do this seemed quite nontrivial to
us.

• in the case where the SDP has a duality gap, proposing a primal suboptimal solution does not seem to 
be an easy task. 

• the computational cost of solving Semi-Definite Programs is much greater than the cost of solving our 
naive relaxation, a fact which may be important in real applications. 

2.4 A utopic relaxation

In order to overcome the drawbacks of the SDP relaxation, we investigate another scheme which may look
utopic at first sight. Notice that one interesting variant of formulation (2.1.1) could be the following, in which
the nonconvex complementarity constraints are merged into the single constraint $\|D(z)x\|_1 = 0$:

$$\max_{z \in \{0,1\}^n} e^t z \quad \text{s.t.} \quad \|D(z)x\|_1 = 0, \quad Ax = y. \qquad (2.4.1)$$

Choosing to keep the constraints $Ax = y$ and $z \in \{0,1\}^n$ implicit in (2.4.1), the Lagrangian function is given
by

$$L(x, z, u) = e^t z - u \|D(z)x\|_1 \qquad (2.4.2)$$

where $D(z)$ is the diagonal matrix with diagonal vector equal to $z$. The dual function (with values in $\mathbb{R} \cup \{+\infty\}$)
is defined by

$$\theta(u) = \max_{z \in \{0,1\}^n,\ Ax = y} L(x, z, u) \qquad (2.4.3)$$

and the dual problem is

$$\inf_{u} \theta(u). \qquad (2.4.4)$$

The main problem with the dual problem (2.4.4) is that the solutions to (2.4.3) are as difficult to obtain as the
solution of the original problem (2.4.1) because of the nonconvexity of the Lagrangian function $L$.

$^2$ Indeed, if $z_0 = -1$, multiply by $-1$ the whole vector $[z_0, {z^*}^t, {x^*}^t]$.



3 The Alternating $\ell_1$ method



We now present a generalization of the $\ell_1$ relaxation, which we call the Alternating $\ell_1$ relaxation, with better
experimental performance than the standard $\ell_1$ relaxation and the SDP relaxation.

3.1 A practical alternative to the utopic relaxation 

Due to the difficulty of computing the dual function $\theta$ in the relaxation of Section 2.4, the interest of this scheme seems
at first to be purely theoretical. In this section, we propose a suboptimal but simple alternating
minimization approach.

When we restrict $z$ to the value $z = e$, i.e. the vector of all ones, solving the problem

$$x(u) = \operatorname{argmax}_{z = e,\ x \in \mathbb{R}^n,\ Ax = y} L(x, z, u) \qquad (3.1.1)$$

gives exactly the solution $\Delta_1(y)$ of the $\ell_1$ relaxation, since $L(x, e, u) = n - u \|x\|_1$ and $u > 0$. From this remark, and the
Lagrangian duality theory above, it may be suspected that a better relaxation can be obtained by trying to optimize
the Lagrangian, even in a suboptimal manner.



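Algorithm 1 Alternating $\ell_1$ algorithm (Alt-$\ell_1$)

Require: $u > 0$ and $L \in \mathbb{N}^*$
$z_u^{(0)} = e$
$x_u^{(0)} \in \operatorname{argmax}_{x \in \mathbb{R}^n,\ Ax = y} L(x, z_u^{(0)}, u)$
$l = 1$
while $l \le L$ do
    $z_u^{(l)} \in \operatorname{argmax}_{z \in \{0,1\}^n} L(x_u^{(l-1)}, z, u)$
    $x_u^{(l)} \in \operatorname{argmax}_{x \in \mathbb{R}^n,\ Ax = y} L(x, z_u^{(l)}, u)$
    $l \leftarrow l + 1$
end while
Output $z_u^{(L)}$ and $x_u^{(L)}$.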



At each step, knowing the value of $z_u^{(l)}$ implies that optimization with respect to $x \in \mathbb{R}^n$ can be equivalently
restricted to the set of variables $x_i$ which are indexed by the $i$'s associated with the values of $z_u^{(l)}$ which are
equal to one. Thus, the choice of $z_u^{(l)}$ corresponds to adaptive support selection for the signal to recover.
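Algorithm 1 can be sketched in a few lines once an LP solver is available. The version below is a minimal illustration under stated assumptions, not a reference implementation: it uses SciPy's `linprog` with the HiGHS backend, the names `weighted_l1_step` and `alternating_l1` are illustrative, the $x$-step is written as the minimization of $\sum_i z_i |x_i|$ subject to $Ax = y$ (equivalent to maximizing $L(x, z, u)$ in $x$ because $u > 0$), and the $z$-step sets $z_i = 1$ exactly when $1 - u|x_i| > 0$, which maximizes the separable function $L(x, z, u) = \sum_i z_i (1 - u |x_i|)$ over $\{0,1\}^n$.

```python
import numpy as np
from scipy.optimize import linprog


def weighted_l1_step(A, y, z):
    """x-step: minimize sum_i z_i * |x_i| subject to Ax = y (an LP)."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.asarray(z, dtype=float)])
    I = np.eye(n)
    res = linprog(
        c,
        A_ub=np.block([[I, -I], [-I, -I]]),  # |x_i| <= t_i
        b_ub=np.zeros(2 * n),
        A_eq=np.hstack([A, np.zeros((m, n))]),
        b_eq=y,
        bounds=[(None, None)] * n + [(0, None)] * n,
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n]


def alternating_l1(A, y, u, L):
    """Sketch of Algorithm 1: alternate the z-step and the x-step L times."""
    n = A.shape[1]
    z = np.ones(n)
    x = weighted_l1_step(A, y, z)  # first step: plain l1 decoder
    for _ in range(L):
        z = (np.abs(x) < 1.0 / u).astype(float)  # coordinatewise z-step
        x = weighted_l1_step(A, y, z)
    return z, x


z, x = alternating_l1(np.array([[1.0, 2.0]]), np.array([2.0]), u=2.0, L=2)
print(z, x)
```

Entries with $z_i = 0$ carry no penalty in the next $x$-step, so the large components identified so far are free to absorb the observations while the remaining ones are pushed toward zero.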

The following lemma states that $z_u^{(l)}$ is in fact the solution of a simple thresholding procedure.

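Lemma 3.1.1 For all $x$ in $\mathbb{R}^n$, any solution $z$ of

$$\max_{z \in [0,1]^n} L(x, z, u) \qquad (3.1.2)$$

satisfies $z_i = 1$ if $|x_i| < \frac{1}{u}$, $z_i = 0$ if $|x_i| > \frac{1}{u}$, and $z_i \in [0, 1]$ otherwise.

Proof. Problem (3.1.2) is clearly separable, since $L(x, z, u) = \sum_{i=1}^n z_i (1 - u |x_i|)$, and the solution can be easily computed coordinatewise. □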
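The thresholding rule of Lemma 3.1.1 is easy to check numerically on a small vector. The plain-Python sketch below (helper names are illustrative, and ties $|x_i| = 1/u$ are arbitrarily sent to zero) compares the thresholded $z$ with an exhaustive search over the binary vectors of $\{0,1\}^n$, which contain a maximizer of (3.1.2) because $L$ is linear in $z$.

```python
from itertools import product


def lagrangian(x, z, u):
    """L(x, z, u) = e^t z - u * ||D(z) x||_1 from (2.4.2)."""
    return sum(z) - u * sum(abs(zi * xi) for zi, xi in zip(z, x))


def threshold_z(x, u):
    """z_i = 1 if |x_i| < 1/u and 0 otherwise (ties are sent to 0 here)."""
    return [1 if abs(xi) < 1.0 / u else 0 for xi in x]


def brute_force_max(x, u):
    """Largest value of L(x, z, u) over all binary vectors z."""
    return max(lagrangian(x, z, u) for z in product([0, 1], repeat=len(x)))


x, u = [0.0, 0.3, -2.0], 2.0  # threshold 1/u = 0.5
z = threshold_z(x, u)
print(z, lagrangian(x, z, u), brute_force_max(x, u))
```

Here the two small entries are kept in the penalized set ($z_i = 1$) and the large one is released ($z_i = 0$), and the thresholded vector attains the brute-force maximum.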



4 Monte Carlo experiments 

In this section, using Monte Carlo experiments, we compare our Alternating $\ell_1$ approach to two recent methods
proposed in the literature: the Reweighted $\ell_1$ of Candès, Wakin and Boyd [3] and the Iteratively Reweighted
Least-Squares as proposed in [6]. The problem size was chosen to be the same as in Chartrand and Yin's paper
[6]: $n = 256$, $m = 100$. For each sparsity level $k$, a hundred different $k$-sparse vectors $x$ were drawn as follows:
the support was chosen uniformly among all supports of cardinality $k$ and the nonzero components were drawn from
the Gaussian distribution $\mathcal{N}(0, 4)$. Then, the observation matrix was obtained in two steps: first draw an $m \times n$
matrix with i.i.d. Gaussian $\mathcal{N}(0, 1)$ entries and then normalize each column in the $\ell_2$ norm as in [6].
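One trial of this experimental setup can be generated as follows. This NumPy sketch rests on two assumptions that the text does not settle: $\mathcal{N}(0, 4)$ is read as variance 4 (standard deviation 2), and the columns of $A$ are scaled to unit $\ell_2$ norm. The function name `draw_instance` and the fixed seed are illustrative.

```python
import numpy as np


def draw_instance(n, m, k, rng):
    """Draw one (A, x, y) triple for the Monte Carlo experiment."""
    x = np.zeros(n)
    # Support drawn uniformly among all subsets of size k.
    support = rng.choice(n, size=k, replace=False)
    # Assumption: N(0, 4) means variance 4, i.e. standard deviation 2.
    x[support] = rng.normal(loc=0.0, scale=2.0, size=k)
    A = rng.standard_normal((m, n))
    # Assumption: each column is rescaled to unit l2 norm.
    A /= np.linalg.norm(A, axis=0)
    return A, x, A @ x


rng = np.random.default_rng(0)
A, x, y = draw_instance(n=256, m=100, k=10, rng=rng)
print(A.shape, np.count_nonzero(x), y.shape)
```

Repeating this draw a hundred times per value of $k$ and counting exact recoveries gives the kind of empirical success rate reported in this section.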

The parameter $u$, namely the Lagrange multiplier for the complementarity constraint, was tuned as follows.
On the one hand, the natural breakdown point for $\ell_0/\ell_1$ equivalence, i.e. equivalence of using $\ell_0$ vs.
$\ell_1$ minimization, lies around $k = 22$; on the other hand, the Alternating $\ell_1$ is nothing but a successive
thresholding algorithm due to Lemma 3.1.1. We therefore decided to choose the smallest possible $u$ such that the 22 largest
components of $x_u^{(0)}$ in the first step of the Alternating $\ell_1$ algorithm (which is nothing but the plain $\ell_1$ decoder, whatever
the value of $u$) are over $\frac{1}{u}$. Notice that this value of $u$ is surely not the solution of the dual problem, but our choice
is at least motivated by reasonable deduction based on practical observations, whereas the tuning parameter in
the other two algorithms is not known to enjoy such an intuitive and meaningful selection rule. We chose $L = 4$
in these experiments. The numerical results for the IRLS and the Reweighted $\ell_1$ were communicated to us by
Rick Chartrand, whom we greatly thank for his collaboration.
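The selection rule for $u$ can be written down directly from the plain $\ell_1$ solution. In the plain-Python sketch below, the name `tune_u`, the parameter `k0` and the small relative `slack` are illustrative assumptions: since the condition "over $\frac{1}{u}$" is strict, the infimum $u = 1/|x|_{(k_0)}$, with $|x|_{(k_0)}$ the $k_0$-th largest magnitude, is not attained, and the sketch takes a value marginally above it.

```python
def tune_u(x0, k0=22, slack=1e-9):
    """Smallest u (up to slack) with the k0 largest |x0_i| above 1/u."""
    magnitudes = sorted((abs(v) for v in x0), reverse=True)
    kth = magnitudes[k0 - 1]  # k0-th largest magnitude
    if kth == 0:
        raise ValueError("fewer than k0 nonzero components")
    return (1.0 + slack) / kth


x0 = [5.0, -4.0, 0.1, 2.0, 0.0]  # toy l1 solution
u = tune_u(x0, k0=2)
print(u, sum(abs(v) > 1.0 / u for v in x0))
```

With this toy vector and `k0=2`, exactly the two largest components exceed the threshold $1/u$, so they are the ones released from the penalty at the first thresholding step.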

A comparison between the success rates of the three methods is shown in Figure 1. Our Alternating $\ell_1$ method
outperformed both the Iteratively Reweighted Least Squares and the Reweighted $\ell_1$ methods for the given data
size. As noted in [6], the IRLS and the Reweighted $\ell_1$ enjoy nearly the same exact recovery success rates.




Figure 1: Rate of success over 100 Monte Carlo experiments in recovering the support of the signal vs. signal 
sparsity k for n = 256, m = 100, L = 4. 



Remark. The Reweighted $\ell_1$ and the Reweighted LS both need a value of $\epsilon$ (or even a sequence of such
values as in [5]) which is hard to optimize ahead of time, whereas the value $u$ in the Alternating $\ell_1$ is a Lagrange
multiplier, i.e. a dual variable. In the Monte Carlo experiments of the previous section, we decided to base
our choice of $u$ on a simple and intuitive criterion suggested by the well-known experimental behavior of the
plain $\ell_1$ relaxation. On the other hand, it would be interesting to explore duality a bit further and perform
experiments in the case where $u$ is approximately optimized (using any derivative-free procedure for instance)
based on our heuristic alternating $\ell_1$ approximation of the dual function $\theta$.